Disclaimer

  • This is about the general method for making these types of inferences.
  • I don’t actually know the details of Facebook’s particular implementation.
  • Or how BuzzFeed grades their quizzes.

Target

New York Times

Cambridge University, Psychometrics Centre

  • Michal Kosinski, a psychologist with the Psychometrics Centre at Cambridge University
  • 2008 - looking at Facebook quizzes that focused on what psychologists call “The Big Five” personality traits: openness, conscientiousness, extroversion, agreeableness, and neuroticism, or OCEAN.
  • “The quiz-takers could opt in to sharing their data with the researchers, and to the researchers’ surprise, millions did.”

  • 2012 - “Kosinski proved that on the basis of an average of 68 Facebook ‘likes’ by a user, it was possible to predict:
    • their skin color (with 95 percent accuracy)
    • their sexual orientation (88 percent accuracy)
    • their affiliation to the Democratic or Republican party (85 percent)”
  • Also: intelligence; religion; alcohol, cigarette, and drug use; whether someone’s parents were divorced
  • ‘Liking’ Wu-Tang Clan is one of the best indicators for heterosexuality
  • ‘Liking’ Lady Gaga correlated highly with extroversion.
  • “Eventually, Facebook made ‘like’ data private, but researchers could still collect it by asking users to opt in.”

Politics

“So what does all this have to do with elections? In 2014, a young assistant professor named Aleksandr Kogan requested access to Kosinski’s database on behalf of an “election management agency” based in London called Strategic Communications Laboratories. Kosinski turned Kogan down, but Kogan went ahead and registered a company under SCL’s umbrella called Cambridge Analytica — an homage, he said, to the university’s work in the field. Cambridge Analytica, under CEO Alexander Nix, went on to work for the pro-Brexit campaign, Senator Ted Cruz’s presidential nomination bid, and then on Donald Trump’s presidential campaign.”

“On the day of the third presidential debate between Trump and Clinton, Trump’s team tested 175,000 different ad variations for his arguments”

Back To Facebook

Washington Post

It’s not all bad

  • Medical care, e.g. diabetes
  • Education, e.g. adaptive testing

Measurement

  • You take a quiz, and get a 78%. What does this mean?

  • You take a survey and find out you are 63% Democrat, or 71% Republican. What does this mean?

A Definition

  • What does it mean to measure something?

According to Stevens (1946), measurement is “the assignment of numerals to objects or events according to rules.”

Another Definition

According to Wright (1997), measures should be:

  1. unidimensional

  2. sample-independent
    • what we measure should not depend on the instrument we use to measure it
    • this holds for person measures as well as for item characteristics
  3. invariantly comparable
    • Higher “ability” means higher probability of answering a question correctly, no matter the question
  4. additive
    • Difference in ability is independent of items used, and difference in difficulties should be independent of the people being measured

Classical Test Theory

  • The “old method”
  • Focused on total scores

Psychometrics: A Probabilistic Framework

  • Questions have “difficulties”
  • People have “abilities”
  • Questions can have other attributes (depending on the model)

Messy notation:

  • \(\theta_i\) - person \(i\)’s ability
  • \(b_j\) - question \(j\)’s difficulty
  • \(P_{i,j}\) - person \(i\)’s probability of answering question \(j\) correctly
    • Technically \(P(X_{i,j}=1|\theta_i,b_j)\)

More measurement

  • Types of variables: nominal, ordinal, interval, ratio
  • What operations make sense for each?
  • Likert questions

Differences

  • What does \(P_{a,j} - P_{a,k}\) mean?
  • What does \(P_{i,b} - P_{j,b}\) mean?
  • What does \(\theta_p - \theta_q\) mean?
  • What does \(b_p-b_q\) mean?
  • What values could each of these differences take?

What should the curves look like?

  • What would we expect the curves relating probability of answering a question correctly / agreeing with a statement vs ability to look like?

What do the curves look like?

What should a curve relating the abilities and probabilities look like? (Use ability as the independent variable)

What kind of function?

  • \(\theta_A - \theta_B\) should be meaningful, and should not depend on the items used (specific objectivity)

  • How are \(P_{Ai}\) and \(P_{Bi}\) related to \(\theta_A\) and \(\theta_B\)? Could it be that \(P_{Ai} - P_{Bi} = \theta_A - \theta_B\)?
    • Domains: \(\theta_{A}\) can be any real number, \(P_{Ai}\) is between 0 and 1
  • Let’s fix it
    • Consider odds: \(D_{Ai} = \frac{P_{Ai}}{1-P_{Ai}}\)
    • \(D_{Ai}\) lies between 0 and \(\infty\).
    • \(\ln(D_{Ai})\) ranges from \(-\infty\) to \(\infty\), just like \(\theta_A\).
  • Now, maybe \(\ln(D_{Ai})-\ln(D_{Bi}) = \theta_A - \theta_B\), or equivalently, \(\ln \left( \frac{P_{Ai}}{1-P_{Ai}} \right)-\ln \left( \frac{P_{Bi}}{1-P_{Bi}} \right) = \theta_A - \theta_B\)
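A minimal Python sketch of the two transformations above, to see how probabilities in (0, 1) spread out over the whole real line. The function names `odds` and `log_odds` are made up for this illustration.

```python
import math

def odds(p):
    """Odds D = p / (1 - p); defined for 0 <= p < 1."""
    return p / (1.0 - p)

def log_odds(p):
    """Natural log of the odds; defined for 0 < p < 1."""
    return math.log(odds(p))

# Probabilities near 0 or 1 map to large negative or positive log-odds.
for p in (0.01, 0.25, 0.5, 0.75, 0.99):
    print(f"P = {p:.2f}  odds = {odds(p):8.4f}  log-odds = {log_odds(p):+.4f}")
```

Note that the log-odds are symmetric around \(P = 0.5\): swapping \(P\) for \(1-P\) only flips the sign.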

What kind of function?

  • What about the same person answering two questions?
  • By similar logic to the last slide, and because a harder item should lower the probability, it is reasonable that \[ b_n-b_m = \ln \left( \frac{P_{Am}}{1-P_{Am}} \right) - \ln \left( \frac{P_{An}}{1-P_{An}} \right) \]

  • Now difficulty and ability are on the same scale

Putting things together

\(\theta_A - \theta_B = \ln \left( \frac{P_{Am}}{1-P_{Am}} \right)-\ln \left( \frac{P_{Bm}}{1-P_{Bm}} \right)\) (taking \(m\) as the common item) means \(\theta_A = \ln \left( \frac{P_{Am}}{1-P_{Am}} \right) + C_1\), where \(C_1\) does not depend on the person

and

\(b_n-b_m = \ln \left( \frac{P_{Am}}{1-P_{Am}} \right) - \ln \left( \frac{P_{An}}{1-P_{An}} \right)\) means \(b_m = C_2 - \ln \left( \frac{P_{Am}}{1-P_{Am}} \right)\), where \(C_2\) does not depend on the item

Both hold with \(C_1 = b_m\) and \(C_2 = \theta_A\). Therefore, \(\theta_A - b_m = \ln \left( \frac{P_{Am}}{1-P_{Am}} \right)\)

Solve for \(P_{Am}\): \(P_{Am} = \dfrac{\exp(\theta_A-b_m)}{1+\exp(\theta_A-b_m)}\)
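The algebra behind that step, written out. Nothing new is assumed; it only exponentiates \(\theta_A - b_m = \ln \left( \frac{P_{Am}}{1-P_{Am}} \right)\) and rearranges:

\[ \frac{P_{Am}}{1-P_{Am}} = e^{\theta_A-b_m} \]

\[ P_{Am} = e^{\theta_A-b_m} - P_{Am}\, e^{\theta_A-b_m} \]

\[ P_{Am} \left( 1+e^{\theta_A-b_m} \right) = e^{\theta_A-b_m} \]

Dividing both sides by \(1+e^{\theta_A-b_m}\) gives the expression for \(P_{Am}\).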

\[ P_{i,j} = \dfrac{e^{\theta_i-b_j}}{1+e^{\theta_i-b_j}} \]

This is the probability that person \(i\) will answer question \(j\) correctly.
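A small Python sketch of this formula, plus a check of the "specific objectivity" idea: the log-odds gap between two people should be the same whichever item is used. The names `rasch_prob` and `log_odds` and all the ability and difficulty values are hypothetical, chosen only for illustration.

```python
import math

def rasch_prob(theta, b):
    """Rasch probability for ability theta and difficulty b.

    1 / (1 + exp(-(theta - b))) is the same function as
    exp(theta - b) / (1 + exp(theta - b)).
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def log_odds(p):
    """Natural log of p / (1 - p)."""
    return math.log(p / (1.0 - p))

# Hypothetical abilities for two people and difficulties for three items.
theta_a, theta_b = 1.2, -0.3
gaps = []
for b in (-1.0, 0.0, 2.0):
    gap = log_odds(rasch_prob(theta_a, b)) - log_odds(rasch_prob(theta_b, b))
    gaps.append(gap)
    print(f"b = {b:+.1f}  log-odds gap between the two people = {gap:.6f}")
```

Up to floating-point rounding, every gap should equal \(\theta_A - \theta_B\), regardless of the item's difficulty.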

A few items

Some analysis!

  • Analyzing the test
  • Analyzing the results

Questions

  • I quickly feel drained when in a large crowd of people
  • I am a cautious decision maker
  • I feel drained after being out and about, even if I’ve enjoyed myself
  • I enjoy large gatherings of people
  • I don’t take risks unless I’ve done some careful research or evaluation first
  • When I was a child, people described me as “quiet”
  • In large social gatherings, I often feel a need to seek out space to be by myself
  • If possible, I would spend every single moment by myself
  • I like to spend at least a few moments alone every day
  • I prefer to work alone than with others

Rasch

##                             Dffclt Dscrmn   P(x=1|z=0)
## few_moments_alone     -2.393858933      1 9.163578e-01
## cautious              -1.472960963      1 8.135070e-01
## drained_after_out     -1.131583530      1 7.561310e-01
## enjoy_large_gather    -1.131124476      1 7.560464e-01
## work_alone            -0.542596614      1 6.324162e-01
## quiet_child           -0.006237960      1 5.015595e-01
## large crowd           -0.006014405      1 5.015036e-01
## risk_careful_research  0.259495753      1 4.354877e-01
## space_myself           0.531349428      1 3.702022e-01
## every_moment_alone    25.566068525      1 7.884924e-12
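The last column can be recovered from the first by evaluating the Rasch curve at ability 0. A short Python check on three rows copied from the table, assuming that `z` in the column header is the ability \(\theta\) and that discrimination is fixed at 1; `prob_at_zero` is a name made up here.

```python
import math

# (item, Dffclt, reported P(x=1|z=0)) copied from the Rasch table
rows = [
    ("few_moments_alone", -2.393858933, 9.163578e-01),
    ("quiet_child", -0.006237960, 5.015595e-01),
    ("space_myself", 0.531349428, 3.702022e-01),
]

def prob_at_zero(b):
    """Rasch probability at theta = 0: exp(-b) / (1 + exp(-b))."""
    return 1.0 / (1.0 + math.exp(b))

for name, b, reported in rows:
    print(f"{name:20s} computed = {prob_at_zero(b):.7f}  reported = {reported:.7f}")
```

Items with negative difficulty are ones a person of average ability (0) is more likely than not to agree with.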

Graph

1PL

##                             Dffclt    Dscrmn   P(x=1|z=0)
## few_moments_alone     -2.908212570 0.7938469 9.095930e-01
## cautious              -1.780270664 0.7938469 8.042800e-01
## enjoy_large_gather    -1.364893865 0.7938469 7.471589e-01
## drained_after_out     -1.364760881 0.7938469 7.471390e-01
## work_alone            -0.650823089 0.7938469 6.263650e-01
## large crowd           -0.003267129 0.7938469 5.006484e-01
## quiet_child           -0.002924083 0.7938469 5.005803e-01
## risk_careful_research  0.317258943 0.7938469 4.373670e-01
## space_myself           0.645493587 0.7938469 3.746257e-01
## every_moment_alone    32.205290267 0.7938469 7.884924e-12

Graph

2PL

##                              Dffclt        Dscrmn   P(x=1|z=0)
## drained_after_out     -4.979436e+00  1.942927e-01 7.246145e-01
## few_moments_alone     -2.487551e+00  9.835886e-01 9.203217e-01
## work_alone            -4.576220e-01  1.347980e+00 6.495053e-01
## quiet_child            5.307843e-03  1.933437e+00 4.974344e-01
## large crowd            1.068790e-02  2.401956e+00 4.935824e-01
## space_myself           3.577605e-01  3.371001e+01 5.785727e-06
## risk_careful_research  4.701684e-01  4.833434e-01 4.434300e-01
## enjoy_large_gather     7.909974e-01 -1.910641e+00 8.192555e-01
## cautious               7.296065e+00 -1.723285e-01 7.785644e-01
## every_moment_alone     1.666318e+16  3.934787e-15 3.349795e-29
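In the 2PL fit each item has its own discrimination. A Python sketch, assuming the Dscrmn column multiplies \(\theta - b\) as in the usual 2PL parameterization (an assumption about how this output was produced); `two_pl_prob` is a hypothetical name. It recomputes the last column for three rows copied from the table.

```python
import math

def two_pl_prob(theta, b, a):
    """2PL probability: logistic function of a * (theta - b)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# (item, Dffclt, Dscrmn, reported P(x=1|z=0)) copied from the 2PL table
rows = [
    ("drained_after_out", -4.979436e+00, 1.942927e-01, 7.246145e-01),
    ("work_alone", -4.576220e-01, 1.347980e+00, 6.495053e-01),
    ("enjoy_large_gather", 7.909974e-01, -1.910641e+00, 8.192555e-01),
]

for name, b, a, reported in rows:
    computed = two_pl_prob(0.0, b, a)
    print(f"{name:20s} computed = {computed:.6f}  reported = {reported:.6f}")
```

A negative discrimination, as for enjoy_large_gather, means the probability of agreeing goes down as \(\theta\) goes up, which is what one would expect from an item worded in the opposite direction.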

Plots

3PL

##                             Gussng        Dffclt        Dscrmn
## drained_after_out     1.803486e-04 -4.506472e+00  2.150155e-01
## few_moments_alone     1.997998e-01 -2.151861e+00  1.009364e+00
## work_alone            7.858138e-16 -4.654201e-01  1.306085e+00
## quiet_child           3.392461e-94  6.801553e-03  2.009522e+00
## large crowd           0.000000e+00  1.222585e-02  2.503847e+00
## risk_careful_research 3.273587e-59  4.073067e-01  5.699412e-01
## enjoy_large_gather    2.000000e-01  4.084517e-01 -2.892699e+00
## space_myself          0.000000e+00  4.376223e-01  9.762449e+03
## cautious              1.067525e-01  1.235845e+01 -8.942770e-02
## every_moment_alone    0.000000e+00  1.077452e+08  2.372830e-07
##                         P(x=1|z=0)
## drained_after_out     7.249620e-01
## few_moments_alone     9.181457e-01
## work_alone            6.474567e-01
## quiet_child           4.965831e-01
## large crowd           4.923477e-01
## risk_careful_research 4.422240e-01
## enjoy_large_gather    8.121779e-01
## space_myself          0.000000e+00
## cautious              7.777877e-01
## every_moment_alone    7.884739e-12
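The 3PL fit adds a guessing parameter, a floor that the probability never drops below. A Python sketch, assuming the usual form \(c + (1-c)\) times the 2PL curve with the Gussng column as \(c\); `three_pl_prob` is a hypothetical name. It recomputes P(x=1|z=0) for three rows copied from the table.

```python
import math

def three_pl_prob(theta, b, a, c):
    """3PL probability: guessing floor c plus (1 - c) times the 2PL curve."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# (item, Gussng, Dffclt, Dscrmn, reported P(x=1|z=0)) copied from the 3PL table
rows = [
    ("few_moments_alone", 1.997998e-01, -2.151861e+00, 1.009364e+00, 9.181457e-01),
    ("work_alone", 7.858138e-16, -4.654201e-01, 1.306085e+00, 6.474567e-01),
    ("cautious", 1.067525e-01, 1.235845e+01, -8.942770e-02, 7.777877e-01),
]

for name, c, b, a, reported in rows:
    computed = three_pl_prob(0.0, b, a, c)
    print(f"{name:20s} computed = {computed:.6f}  reported = {reported:.6f}")
```

With \(c = 0\) this reduces to the 2PL curve, and with \(c = 0\) and \(a = 1\) it reduces to the Rasch curve.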

Plots

What does this mean for you?

We could either compare the models statistically, or try to make sense of which model might be reasonable.

What model might be good, and why?

Let’s consider the Rasch model

##                             Dffclt Dscrmn
## few_moments_alone     -2.393858933      1
## cautious              -1.472960963      1
## drained_after_out     -1.131583530      1
## enjoy_large_gather    -1.131124476      1
## work_alone            -0.542596614      1
## quiet_child           -0.006237960      1
## large crowd           -0.006014405      1
## risk_careful_research  0.259495753      1
## space_myself           0.531349428      1
## every_moment_alone    25.566068525      1

This means the curve for “large crowd” is:

\[\frac{\exp(\theta+0.006014405)}{1+\exp(\theta+0.006014405)}\] etc.

A hypothetical person

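Suppose I said “agree” to the first four items in the table, up to and including “enjoy_large_gather”, and “disagree” to everything after.

For the early items I’m indicating that I follow the curve (the probability of agreeing); for the later ones I’m indicating the complement (one minus the curve).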

The likelihood function

Suppose we have functions \(p_n(\theta) = \frac{\exp(\theta - b_n)}{1+\exp(\theta-b_n)}\), where \(b_n\) is the difficulty of the \(n\)th item in the table

The “likelihood function” for my introversion is \(p_1(\theta) \times p_2(\theta) \times p_3(\theta) \times p_4(\theta) \times (1-p_5(\theta)) \times (1-p_6(\theta)) \times (1-p_7(\theta)) \times (1-p_8(\theta)) \times (1-p_9(\theta)) \times (1-p_{10}(\theta))\)

What does this actually look like???
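One way to look at it is to evaluate the product on a grid of \(\theta\) values. A Python sketch using the Rasch difficulties in table order and the agree/disagree pattern from the product above; the grid range and step are arbitrary choices made for this illustration, and a real analysis would more likely use a numerical optimizer.

```python
import math

# Rasch difficulties in table order (Dffclt column of the Rasch fit)
difficulties = [
    -2.393858933, -1.472960963, -1.131583530, -1.131124476, -0.542596614,
    -0.006237960, -0.006014405, 0.259495753, 0.531349428, 25.566068525,
]
# 1 = agree, 0 = disagree: four agrees followed by six disagrees
responses = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

def p(theta, b):
    """Rasch probability of agreeing with an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def likelihood(theta):
    """Product of p for each agree and (1 - p) for each disagree."""
    value = 1.0
    for b, x in zip(difficulties, responses):
        value *= p(theta, b) if x == 1 else 1.0 - p(theta, b)
    return value

# Evaluate on a grid from -6 to 6 in steps of 0.001 and keep the best theta.
grid = [i / 1000.0 for i in range(-6000, 6001)]
theta_hat = max(grid, key=likelihood)
print(f"likelihood is highest near theta = {theta_hat:.3f}")
```

The function rises to a single peak and falls away on both sides. At the peak, the expected number of agrees, \(\sum_n p_n(\theta)\), matches the number of items actually agreed with.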

Actual Scores

##  [1] -1.2888267 -0.5864932 -0.9332466 -0.5864932  0.4855150 -0.9332466
##  [7] -0.2404238 -0.2404238  0.1135676 -0.2404238 -0.2404238  0.8888117
## [13]  0.4855150  0.1135676  0.4855150  0.1135676  0.8888117  1.3411824